Text File | 1993-03-26 | 14KB | 322 lines
Hello, World!
Once upon a time, my cosysop Elio Fenoglio told me: "Idea *:-) Why don't we
create a new standard for message bases, one that minimizes the space occupied
on disk by storing messages in compressed form, and that eliminates all the
limits of the Hudson QuickBase?". My first answer was: "No, it's too
difficult! And I cannot impose a new standard."
In the meantime, I was improving AtomAnt, a message reader for the Cambridge
Computer Z88 that reads messages directly from a .pkt (there's no space for
tossing in its 160 K of static RAM). To maximize the number of messages I
could read when travelling by train, I thought: "Why not read messages
directly from a compressed .MO0? Using a simpler packer than ARC, it's not
hard work." After some days, I had created Z88PIP, a message packer for the
IBM PC, and AtomAnt 1.02, the first reader that reads messages directly from
a compressed packet. Happy with this, I went to bed.
It was about 2:00 AM when I woke up with the enlightenment: "BUT I HAVE JUST
BUILT IT: A COMPRESSED MESSAGE BASE!". After some all-nighters, I had designed
the PipBase message format, and I started to develop all the necessary
programs: an editor, a mail processor, and some maintenance utilities.
Thanks to Elio Fenoglio (my cosysop and author of PipSetup), Giovanni Lopes
Pegna (author of Mercurio), Marco Maccaferri (author of Lora), Marco Russo
(author of some interesting libraries) and Stefano Pasquini (the first who
forwarded my PipBase out of Italy), the idea spread around the world.
Special thanks also to all my beta testers and to Paolo Rosa, for their
encouragement and suggestions.
PIP BASE FORMAT:
In this document I'll try to define the Pip Message Base format.
I suggest you get the MessageObject API (file-request it as MSGOBJ*.* from
Pip* support BBSes) and take a look at its header files and procedures (note
that this document is also included in the MessageObject distribution
package). I assume that you are not a novice in computer programming.
CONVENTIONS:
- all integers are 16 bits, in Intel (little-endian) format
- long integers are 32 bits, in Intel format
- data is byte-aligned, unless otherwise specified
- "uint" means "unsigned int"
BASICS:
- All Pip Base files reside in a single directory (except some configuration
files that are generated and needed by specific programs); this directory
will be referred to as the "message base directory"
- Every message area is contained in two files: a text file (MPKTxxxx.PIP)
and an index file (MPTRxxxx.PIP); xxxx is a hexadecimal integer,
right-justified and zero-padded, indicating the area number. Areas are
numbered from 0 to 32767 (from M???0000.PIP to M???7FFF.PIP).
- To keep the configuration and lastread files small, your system may limit
the number of available areas. Pip* allows you to choose an area limit of
64, 256, 1024, 4096 or 32768 areas, and it is a good idea to look at the
Pip* configuration files...
- An area description file, BASEDESC.PIP, is stored in the same directory; all
programs should refer to this file to configure themselves; the number of
records in this file is the total number of areas in your system; unused
areas have their respective record in BASEDESC.PIP zero-filled.
- An index file to rapidly find messages for a given user is called
DESTPTR.PIP and is stored in the messagebase directory.
- A lastread pointers file, called LASTREAD.PIP and stored in the messagebase
directory, contains lastread information.
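
The naming rule above can be sketched as a small helper. This is only an
illustration: the function name pip_area_filename and the "/" path separator
are assumptions of this sketch, not part of the Pip* sources.

```c
#include <stdio.h>
#include <stddef.h>

/* Build "<dir>/<prefix>xxxx.PIP", where xxxx is the area number as four
   zero-padded uppercase hexadecimal digits (0000..7FFF).
   prefix is "MPKT" for the text file or "MPTR" for the index file.
   Returns 0 on success, -1 if the area is out of range or the buffer is
   too small. */
static int pip_area_filename(char *out, size_t outsz, const char *dir,
                             const char *prefix, unsigned area)
{
    int n;

    if (area > 0x7FFF)
        return -1;
    n = snprintf(out, outsz, "%s/%s%04X.PIP", dir, prefix, area);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

For example, area 26 in a message base directory named MSG would give
MSG/MPKT001A.PIP and MSG/MPTR001A.PIP.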
FILE STRUCTURE:
- MPTRxxxx.PIP: a file of records of type MSGPTR, indexing area xxxx.
record structure follows:
  typedef struct          /* structure of each record in MPTRxxxx.PIP files */
  { long pos;             /* pointer to MPKTxxxx.PIP */
    uint prev,next;       /* pointers to other records in MPTRxxxx */
    uint status;          /* bit 0=deleted 1=received 2=sent */
                          /*     3=fromus(1)/tous(0) */
                          /*     4=Locked (Undeletable) */
  } MSGPTR;
pos is the offset, in the corresponding MPKTxxxx.PIP file, at which the
message header starts.
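
On a modern compiler, "int" and "long" are rarely 16 and 32 bits, so reading
MSGPTR straight into a struct is unreliable. A minimal sketch of a portable
decoder follows; it assumes the byte-aligned, little-endian layout given in
CONVENTIONS (10 bytes per record), and the names MsgPtr, msgptr_decode and
the PTR_* flags are invented for this example.

```c
#include <stdint.h>

#define MSGPTR_SIZE 10          /* 4 (pos) + 2 (prev) + 2 (next) + 2 (status) */

enum {                          /* status bits, as listed in MSGPTR */
    PTR_DELETED  = 1 << 0,
    PTR_RECEIVED = 1 << 1,
    PTR_SENT     = 1 << 2,
    PTR_FROMUS   = 1 << 3,      /* 1 = from us, 0 = to us */
    PTR_LOCKED   = 1 << 4
};

typedef struct {
    int32_t  pos;               /* offset of the header in MPKTxxxx.PIP */
    uint16_t prev, next;        /* other records in MPTRxxxx.PIP */
    uint16_t status;
} MsgPtr;

/* Read a 16-bit little-endian value. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 32-bit little-endian value. */
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one raw 10-byte record read from MPTRxxxx.PIP. */
static MsgPtr msgptr_decode(const unsigned char rec[MSGPTR_SIZE])
{
    MsgPtr m;

    m.pos    = (int32_t)rd32(rec);
    m.prev   = rd16(rec + 4);
    m.next   = rd16(rec + 6);
    m.status = rd16(rec + 8);
    return m;
}
```

Under the same assumption, record number n starts at byte n * MSGPTR_SIZE of
the index file.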
- MPKTxxxx.PIP: this file is composed of a series of messages, followed by two
NUL bytes (the two NUL bytes are a sort of EOF identifier).
Each message is composed of:
- a header with the following structure:
  typedef struct  // structure of the message headers in MPKTxxxx.PIP files
  { uint pktype;     /* 2=not compressed; 10=compressed with PIP */
    uint fromnode,tonode,fromnet,tonet;  /* for netmail */
    uint attribute;  /* bit 0=private as for SeaDog,
                            1=crash as for SeaDog,
                            2=received as for SeaDog,
                            3=sent as for SeaDog,
                            4=fileattach as for SeaDog,
                            5=in transit as for SeaDog,
                            7=kill/sent as for SeaDog,
                            8=local as for SeaDog,
                            9=hold as for SeaDog,
                           10=locked,
                           11=filerequest as for SeaDog,
                           12=Return Receipt request,
                           13=Is Return Receipt,
                           14=Audit Request,
                           15=fileupdaterequest as for SeaDog */
    uint point;      // reserved
  } MSGPKTHDR;
- a nul-terminated string, 19 characters long (20, including the
ending zero), containing the date of the message in the format
- day (two digits, right-justified and zero-padded)
- a space (ASCII decimal code 32)
- month (three letters: Jan/Feb/Mar/etc...)
- a space
- year (four digits, right-justified and zero-padded)
- two spaces
- hour (two digits, right-justified and zero-padded)
- a colon
- minute (two digits, right-justified and zero-padded)
- a colon
- seconds (two digits, right-justified and zero-padded)
- a nul-terminated string, of variable length (up to 35 characters long,
36 including trailing zero), containing the name of the recipient of the
message
- a nul-terminated string, of variable length (up to 35 characters long,
36 including trailing zero), containing the name of the sender of the
message
- a nul-terminated string, of variable length (up to 71 characters long,
72 including trailing zero), containing the subject of the message
- a nul-terminated string, of variable length (theoretically unbounded, but
every mailer may impose a specific limit, due to internal limitations;
PipBase 1.00, my first mail processor, used a 30000-byte limit; Pip*
2.00 uses a length limited only by memory), containing the message text.
- if the field pktype of the header was 2, the text is a regular,
uncompressed, string
- if pktype was 10, the string is compressed using this fixed table:
Code Meaning
0 Text terminator
1..127 regular ASCII characters
128 quote character: represents a non-7-bit character
in two-byte form: take the byte that follows
the 128 and add 127 to obtain the character
129 "SEEN-BY: "
130 "MSGID: "
131 "PATH: "
132 ": "
133 "zion"
134 "ment"
135 "---"
136 "che"
137 "chi"
138 "ghe"
139 "ghi"
140 "str"
141 soft CR (see the notes after this table)
142 "il"
143 "al"
144 "ed"
145 "pr"
146 "st"
147 ".."
148 " "
149 ", "
150 ". "
151 "; "
152 "++"
153 "a'"
154 "e'"
155 "i'"
156 "o'"
157 "u'"
158 "a "
159 "e "
160 "i "
161 "o "
162 "u "
163 "nt"
164 "hi"
165 "bb"
166 "ba"
167 "be"
168 "bi"
169 "bo"
170 "bu"
171 "cc"
172 "ca"
173 "ce"
174 "ci"
175 "co"
176 "cu"
177 "dd"
178 "da"
179 "de"
180 "di"
181 "do"
182 "du"
183 "ff"
184 "fa"
185 "fe"
186 "fi"
187 "fo"
188 "fu"
189 "gg"
190 "ga"
191 "ge"
192 "gi"
193 "go"
194 "gu"
195 "ll"
196 "la"
197 "le"
198 "li"
199 "lo"
200 "lu"
201 "mm"
202 "ma"
203 "me"
204 "mi"
205 "mo"
206 "mu"
207 "nn"
208 "na"
209 "ne"
210 "ni"
211 "no"
212 "nu"
213 "pp"
214 "pa"
215 "pe"
216 "pi"
217 "po"
218 "pu"
219 "rr"
220 "ra"
221 "re"
222 "ri"
223 "ro"
224 "ru"
225 "ss"
226 "sa"
227 "se"
228 "si"
229 "so"
230 "su"
231 "tt"
232 "ta"
233 "te"
234 "ti"
235 "to"
236 "tu"
237 "vv"
238 "va"
239 "ve"
240 "vi"
241 "vo"
242 "vu"
243 "zz"
244 "za"
245 "ze"
246 "zi"
247 "zo"
248 "zu"
249 "=="
250 ":-"
251 "' "
252 "ha"
253 "ho"
254 "qu"
255 no meaning. Pip* translates it with a #
This compression method is not adaptive, but it works quite well on
"normal" text (i.e.: text where you do not use high-ASCII characters
and you do not SHOUT WITH ALL-CAPS PHRASES); adaptive algorithms can
achieve better compression ratios, but they're too slow to be
used in a message reader.
- CRs (ASCII decimal code 13) are used to indicate <hard> CRs (i.e.:
paragraph ends)
- ASCII code 141 is used to indicate a "soft CR", i.e.: where a line
ends due to word-wrapping
- LFs (ASCII decimal code 10) are ignored
- kludges are, as usual, preceded by an ASCII code 1
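
As an illustration of the fixed table above, here is a minimal sketch of an
expander for text with pktype 10. It is not the Pip* code. The name
pip_expand is invented, entry 141 is assumed to stand for the soft CR
(byte 141), entry 148 is taken as a single space exactly as printed in the
table, and codes 165..248 are computed from their regular pattern (14
consonants, each doubled and then followed by a, e, i, o, u).

```c
#include <stddef.h>
#include <string.h>

/* Codes 129..164, copied from the table. */
static const char *const pip_low[36] = {
    "SEEN-BY: ", "MSGID: ", "PATH: ", ": ", "zion", "ment", "---", "che",
    "chi", "ghe", "ghi", "str", "\x8d", "il", "al", "ed", "pr", "st", "..",
    " ", ", ", ". ", "; ", "++", "a'", "e'", "i'", "o'", "u'", "a ", "e ",
    "i ", "o ", "u ", "nt", "hi"
};

/* Codes 249..254, copied from the table. */
static const char *const pip_high[6] = { "==", ":-", "' ", "ha", "ho", "qu" };

/* Expand a NUL-terminated compressed text into out (NUL-terminated).
   Returns the expanded length, or -1 if out is too small. */
static long pip_expand(const unsigned char *in, unsigned char *out,
                       size_t outsz)
{
    static const char cons[] = "bcdfglmnprstvz", vow[] = "aeiou";
    size_t n = 0;

    for (;;) {
        unsigned c = *in++;
        char tmp[3] = { 0, 0, 0 };
        const char *s = tmp;
        size_t len;

        if (c == 0)
            break;                          /* text terminator */
        if (c < 128) {
            tmp[0] = (char)c;               /* regular ASCII character */
        } else if (c == 128) {
            if (*in == 0)
                break;                      /* truncated quote: stop */
            tmp[0] = (char)(*in++ + 127);   /* quoted high character */
        } else if (c <= 164) {
            s = pip_low[c - 129];
        } else if (c <= 248) {
            unsigned k = c - 165;           /* "bb","ba",...,"zu" */
            tmp[0] = cons[k / 6];
            tmp[1] = (k % 6) ? vow[k % 6 - 1] : cons[k / 6];
        } else if (c <= 254) {
            s = pip_high[c - 249];
        } else {
            tmp[0] = '#';                   /* 255: Pip* shows it as # */
        }
        len = strlen(s);
        if (n + len + 1 > outsz)
            return -1;
        memcpy(out + n, s, len);
        n += len;
    }
    if (n + 1 > outsz)
        return -1;
    out[n] = '\0';
    return (long)n;
}
```

The expanded text still contains hard CRs (13), soft CRs (141) and any LFs;
handling them as described in the notes above is left to the caller.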
- DESTPTR.PIP: this file is composed of a sequence of records in the following
format:
  typedef struct          /* structure of each record in DESTPTR.PIP */
  { char to[36];          /* addressee name */
    uint area,msg;        /* pointers to MPTRarea.PIP records */
    long unused;          /* reserved for future use */
  } DESTPTR;
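
A minimal sketch of how a reader might scan DESTPTR.PIP for one user,
assuming the byte-aligned layout from CONVENTIONS (44 bytes per record).
The names DestPtr, destptr_decode and destptr_count_for are invented here,
and the exact, case-sensitive name comparison is an assumption: this
document does not say how Pip* compares names.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define DESTPTR_SIZE 44         /* 36 (to) + 2 (area) + 2 (msg) + 4 (unused) */

typedef struct {
    char     to[36];            /* addressee name, NUL-terminated */
    uint16_t area;              /* area number */
    uint16_t msg;               /* record number in MPTRarea.PIP */
} DestPtr;

/* Decode one raw 44-byte record; the reserved field is skipped. */
static DestPtr destptr_decode(const unsigned char *rec)
{
    DestPtr d;

    memcpy(d.to, rec, 36);
    d.to[35] = '\0';            /* guard against a missing terminator */
    d.area = (uint16_t)(rec[36] | (rec[37] << 8));
    d.msg  = (uint16_t)(rec[38] | (rec[39] << 8));
    return d;
}

/* Count the records in buf (nrec records) addressed to name. */
static size_t destptr_count_for(const unsigned char *buf, size_t nrec,
                                const char *name)
{
    size_t i, hits = 0;

    for (i = 0; i < nrec; i++) {
        DestPtr d = destptr_decode(buf + i * DESTPTR_SIZE);
        if (strcmp(d.to, name) == 0)
            hits++;
    }
    return hits;
}
```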
- BASEDESC.PIP: this file is a sequence of records (a record for each area;
record number is the area number) of the following format:
  typedef struct             /* structure for each record of BASEDESC.PIP */
  { char descr[40],          /* description of the area */
         tag[30];            /* echomail tag, #LOCAL, #BAD, #DUPES or #NETMAIL */
    uint nrmsgs,days;        /* to perform PURGE; nrmsgs=0: passthru area */
    unsigned char killrcv,   /* kill received messages? */
         readlevel,readflags[4],
         writelevel,writeflags[4], /* for Remote Access style access control */
         origin, address,    /* index for the appropriate table */
         note[80],           /* whatever you want */
         forward_to[32];     /* this is a bitmap on FRIENDND.PIP's 256 records */
    char origmode;           /* 0=fixed origin; 1=random system; 2=cyclic system */
    int startorig,endorig;   /* for random or cyclic origin selection */
    long inmsgmonth,inmsgyear;   /* for statistics */
    long outmsgmonth,outmsgyear; /* for statistics */
    char expansion_box[36];  /* please apply to define this */
  } AREASTRUCT;
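
With 16-bit integers, 32-bit longs and byte alignment, the fields of
AREASTRUCT add up to exactly 256 bytes. Under that assumption, a program can
locate an area's record and test whether the area is unused (zero-filled)
without declaring the whole structure. The names below are invented for this
sketch.

```c
#include <stddef.h>

#define AREASTRUCT_SIZE 256     /* sum of the field sizes, byte-aligned */

/* Byte offset of the record for a given area in BASEDESC.PIP. */
static long basedesc_offset(unsigned area)
{
    return (long)area * AREASTRUCT_SIZE;
}

/* Number of areas configured, given the size of BASEDESC.PIP in bytes. */
static long basedesc_area_count(long file_size)
{
    return file_size / AREASTRUCT_SIZE;
}

/* An unused area has its whole record zero-filled. */
static int basedesc_is_unused(const unsigned char rec[AREASTRUCT_SIZE])
{
    size_t i;

    for (i = 0; i < AREASTRUCT_SIZE; i++)
        if (rec[i] != 0)
            return 0;
    return 1;
}
```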
- LASTREAD.PIP: it is a large array of [number_of_users][number_of_areas]
integers (message numbers); simply, lastread[user#][area#] is the last
message read by the specified user. Usually, user#0 is the system operator.
Number_of_areas is the maximum number of areas that you allow in your
system.
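
Read as a C array lastread[number_of_users][number_of_areas] of 16-bit
integers, the file is laid out user by user. A minimal sketch of the offset
computation follows; the function name is invented, and the row-major order
is an assumption based on the array notation above.

```c
/* Byte offset in LASTREAD.PIP of lastread[user][area], or -1 if the
   area is outside the configured limit. Each entry is a 16-bit integer. */
static long lastread_offset(unsigned user, unsigned area,
                            unsigned number_of_areas)
{
    if (area >= number_of_areas)
        return -1;
    return ((long)user * (long)number_of_areas + (long)area) * 2L;
}
```

For example, with a limit of 64 areas, the entry for user#2 in area 3 is at
byte (2 * 64 + 3) * 2 = 262.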
And that's all, folks.
Send any suggestions, contributions, thanks or criticism to:
Roberto Piola
2:334/108.57@fidonet.org